AITopics | kernel alignment

Collaborating Authors

kernel alignment

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards a Learning Theory of Representation Alignment

Insulla, Francesco, Huang, Shuo, Rosasco, Lorenzo

arXiv.org Machine LearningFeb-19-2025

It has recently been argued that AI models' representations are becoming aligned as their scale and performance increase. Empirical analyses have been designed to support this idea and conjecture the possible alignment of different representations toward a shared statistical model of reality. In this paper, we propose a learning-theoretic perspective to representation alignment. First, we review and connect different notions of alignment based on metric, probabilistic, and spectral ideas. Then, we focus on stitching, a particular approach to understanding the interplay between different representations in the context of a task. Our main contribution here is relating properties of stitching to the kernel alignment of the underlying representation. Our results can be seen as a first step toward casting representation alignment as a learning-theoretic problem.

alignment, kernel alignment, representation, (12 more...)

arXiv.org Machine Learning

2502.14047

Country:

Asia > Middle East > Jordan (0.04)
Europe > Italy > Liguria > Genoa (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Mitigating exponential concentration in covariant quantum kernels for subspace and real-world data

Agliardi, Gabriele, Cortiana, Giorgio, Dekusar, Anton, Ghosh, Kumar, Mohseni, Naeimeh, O'Meara, Corey, Valls, Víctor, Yogaraj, Kavitha, Zhuk, Sergiy

arXiv.org Artificial IntelligenceDec-10-2024

Fidelity quantum kernels have shown promise in classification tasks, particularly when a group structure in the data can be identified and exploited through a covariant feature map. In fact, there exist classification problems on which covariant kernels provide a provable advantage, thus establishing a separation between quantum and classical learners. However, their practical application poses two challenges: on one side, the group structure may be unknown and approximate in real-world data, and on the other side, scaling to the `utility' regime (above 100 qubits) is affected by exponential concentration. In this work, we address said challenges by applying fidelity kernels to real-world data with unknown structure, related to the scheduling of a fleet of electric vehicles, and to synthetic data generated from the union of subspaces, which is then close to many relevant real-world datasets. Furthermore, we propose a novel error mitigation strategy specifically tailored for fidelity kernels, called Bit Flip Tolerance (BFT), to alleviate the exponential concentration in our utility-scale experiments. Our multiclass classification reaches accuracies comparable to classical SVCs up to 156 qubits, thus constituting the largest experimental demonstration of quantum machine learning on IBM devices to date. For the real-world data experiments, the effect of the proposed BFT becomes manifest on 40+ qubits, where mitigated accuracies reach 80%, in line with classical, compared to 33% without BFT. Through the union-of-subspace synthetic dataset with 156 qubits, we demonstrate a mitigated accuracy of 80%, compared to 83% of classical models, and 37% of unmitigated quantum, using a test set of limited size.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.07915

Country:

Africa > Middle East > Morocco > Marrakesh-Safi Region > Marrakesh (0.04)
Europe > Italy (0.04)
Europe > Germany > Lower Saxony > Hanover (0.04)
(2 more...)

Genre: Research Report (0.63)

Industry:

Energy (1.00)
Transportation > Ground > Road (0.54)
Transportation > Electric Vehicle (0.54)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Bayesian Comparisons Between Representations

Schütt, Heiko H.

arXiv.org Artificial IntelligenceNov-13-2024

Which neural networks are similar is a fundamental question for both machine learning and neuroscience. Our novel method compares representations based on Bayesian statistics about linear readouts from the representations. Concretely, we suggest to use the total variation distance or Jensen-Shannon distance between prior predictive distributions to compare representations. The prior predictive distribution is a full description of the inductive bias and generalization of a model in Bayesian statistics, making it a great basis for comparisons. As Jensen-Shannon distance and total variation distance are metrics our dissimilarity measures are pseudo-metrics for representations. For a linear readout, our metrics just depend on the linear kernel matrix of the representations. Thus, our metrics connects linear read-out based comparisons to kernel based metrics like centered kernel alignment and representational similarity analysis. We apply our new metrics to deep neural networks trained on ImageNet-1k. Our new metrics can be computed efficiently including a stochastic gradient without dimensionality reductions of the representations. It broadly agrees with existing metrics, but is more stringent. It varies less across different random image samples, and it measures how well two representations could be distinguished based on a linear read out. Thus our metric nicely extends our toolkit for comparing representations.

predictive distribution, representation, variance, (15 more...)

arXiv.org Artificial Intelligence

2411.08739

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis

Li, Junfan, Liao, Shizhong

arXiv.org Artificial IntelligenceJul-2-2024

Online kernel selection is a fundamental problem of online kernel methods.In this paper,we study online kernel selection with memory constraint in which the memory of kernel selection and online prediction procedures is limited to a fixed budget. An essential question is what is the intrinsic relationship among online learnability, memory constraint, and data complexity? To answer the question,it is necessary to show the trade-offs between regret and memory constraint.Previous work gives a worst-case lower bound depending on the data size,and shows learning is impossible within a small memory constraint.In contrast, we present distinct results by offering data-dependent upper bounds that rely on two data complexities:kernel alignment and the cumulative losses of competitive hypothesis.We propose an algorithmic framework giving data-dependent upper bounds for two types of loss functions.For the hinge loss function,our algorithm achieves an expected upper bound depending on kernel alignment.For smooth loss functions,our algorithm achieves a high-probability upper bound depending on the cumulative losses of competitive hypothesis.We also prove a matching lower bound for smooth loss functions.Our results show that if the two data complexities are sub-linear,then learning is possible within a small memory constraint.Our algorithmic framework depends on a new buffer maintaining framework and a reduction from online kernel selection to prediction with expert advice. Finally,we empirically verify the prediction performance of our algorithms on benchmark datasets.

algorithm, loss function, memory constraint, (14 more...)

arXiv.org Artificial Intelligence

2407.00916

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.34)

Add feedback

QUACK: Quantum Aligned Centroid Kernel

Tscharke, Kilian, Issel, Sebastian, Debus, Pascal

arXiv.org Artificial IntelligenceMay-1-2024

Quantum computing (QC) seems to show potential for application in machine learning (ML). In particular quantum kernel methods (QKM) exhibit promising properties for use in supervised ML tasks. However, a major disadvantage of kernel methods is their unfavorable quadratic scaling with the number of training samples. Together with the limits imposed by currently available quantum hardware (NISQ devices) with their low qubit coherence times, small number of qubits, and high error rates, the use of QC in ML at an industrially relevant scale is currently impossible. As a small step in improving the potential applications of QKMs, we introduce QUACK, a quantum kernel algorithm whose time complexity scales linear with the number of samples during training, and independent of the number of training samples in the inference stage. In the training process, only the kernel entries for the samples and the centers of the classes are calculated, i.e. the maximum shape of the kernel for n samples and c classes is (n, c). During training, the parameters of the quantum kernel and the positions of the centroids are optimized iteratively. In the inference stage, for every new sample the circuit is only evaluated for every centroid, i.e. c times. We show that the QUACK algorithm nevertheless provides satisfactory results and can perform at a similar level as classical kernel methods with quadratic scaling during training. In addition, our (simulated) algorithm is able to handle high-dimensional datasets such as MNIST with 784 features without any dimensionality reduction.

algorithm, dataset, kernel, (14 more...)

arXiv.org Artificial Intelligence

2405.00304

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Kernel Alignment for Unsupervised Feature Selection via Matrix Factorization

Lin, Ziyuan, Needell, Deanna

arXiv.org Artificial IntelligenceMar-13-2024

By removing irrelevant and redundant features, feature selection aims to find a good representation of the original features. With the prevalence of unlabeled data, unsupervised feature selection has been proven effective in alleviating the so-called curse of dimensionality. Most existing matrix factorization-based unsupervised feature selection methods are built upon subspace learning, but they have limitations in capturing nonlinear structural information among features. It is well-known that kernel techniques can capture nonlinear structural information. In this paper, we construct a model by integrating kernel functions and kernel alignment, which can be equivalently characterized as a matrix factorization problem. However, such an extension raises another issue: the algorithm performance heavily depends on the choice of kernel, which is often unknown a priori. Therefore, we further propose a multiple kernel-based learning method. By doing so, our model can learn both linear and nonlinear similarity information and automatically generate the most appropriate kernel. Experimental analysis on real-world data demonstrates that the two proposed methods outperform other classic and state-of-the-art unsupervised feature selection methods in terms of clustering results and redundancy reduction in almost all datasets tested.

algorithm, feature selection, selection, (12 more...)

arXiv.org Artificial Intelligence

2403.14688

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Greedy feature selection: Classifier-dependent feature selection via greedy methods

Camattari, Fabiana, Guastavino, Sabrina, Marchetti, Francesco, Piana, Michele, Perracchione, Emma

arXiv.org Machine LearningMar-8-2024

The purpose of this study is to introduce a new approach to feature ranking for classification tasks, called in what follows greedy feature selection. In statistical learning, feature selection is usually realized by means of methods that are independent of the classifier applied to perform the prediction using that reduced number of features. Instead, greedy feature selection identifies the most important feature at each step and according to the selected classifier. In the paper, the benefits of such scheme are investigated theoretically in terms of model capacity indicators, such as the Vapnik-Chervonenkis (VC) dimension or the kernel alignment, and tested numerically by considering its application to the problem of predicting geo-effective manifestations of the active Sun.

classifier, feature selection, selection, (15 more...)

arXiv.org Machine Learning

2403.05138

Country:

Asia > Singapore (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

Efficient Parameter Optimisation for Quantum Kernel Alignment: A Sub-sampling Approach in Variational Training

Sahin, M. Emre, Symons, Benjamin C. B., Pati, Pushpak, Minhas, Fayyaz, Millar, Declan, Gabrani, Maria, Robertus, Jan Lukas, Mensa, Stefano

arXiv.org Artificial IntelligenceJan-5-2024

Quantum machine learning with quantum kernels for classification problems is a growing area of research. Recently, quantum kernel alignment techniques that parameterise the kernel have been developed, allowing the kernel to be trained and therefore aligned with a specific dataset. While quantum kernel alignment is a promising technique, it has been hampered by considerable training costs because the full kernel matrix must be constructed at every training iteration. Addressing this challenge, we introduce a novel method that seeks to balance efficiency and performance. We present a sub-sampling training approach that uses a subset of the kernel matrix at each training step, thereby reducing the overall computational cost of the training. In this work, we apply the sub-sampling method to synthetic datasets and a real-world breast cancer dataset and demonstrate considerable reductions in the number of circuits required to train the quantum kernel while maintaining classification accuracy.

dataset, iteration, kernel, (16 more...)

arXiv.org Artificial Intelligence

2401.02879

Country:

Europe > United Kingdom (0.28)
Africa > Kenya > Nairobi City County > Nairobi (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > Promising Solution (0.86)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Add feedback

An Adaptive Tangent Feature Perspective of Neural Networks

LeJeune, Daniel, Alemohammad, Sina

arXiv.org Artificial IntelligenceDec-28-2023

In order to better understand feature learning in neural networks, we propose a framework for understanding linear models in tangent feature space where the features are allowed to be transformed during training. We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear interpolation constraint. We show that this optimization problem has an equivalent linearly constrained optimization with structured regularization that encourages approximately low rank solutions. Specializing to neural network structure, we gain insights into how the features and thus the kernel function change, providing additional nuance to the phenomenon of kernel alignment when the target function is poorly represented using tangent features. In addition to verifying our theoretical observations in real neural networks on a simple regression problem, we empirically show that an adaptive feature implementation of tangent feature classification has an order of magnitude lower sample complexity than the fixed tangent feature model on MNIST and CIFAR-10.

kernel, neural network, tangent feature, (14 more...)

arXiv.org Artificial Intelligence

2308.15478

Country: Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Algorithms for the Training of Neural Support Vector Machines

Simon, Lars, Radons, Manuel

arXiv.org Artificial IntelligenceAug-14-2023

Neural support vector machines (NSVMs) allow for the incorporation of domain knowledge in the design of the model architecture. In this article we introduce a set of training algorithms for NSVMs that leverage the Pegasos algorithm and provide a proof of concept by solving a set of standard machine learning tasks.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.07204

Country:

North America > United States > California (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback